Bayesian Mean Field Algorithms for Neural Networks and Gaussian Processes

Author

  • Ole Winther
Abstract

The subject of this thesis is the derivation and study of Bayesian mean field learning algorithms for feed-forward neural networks and Gaussian processes. In Bayes learning our posterior beliefs, based upon our a priori knowledge and past observations, are expressed as probabilities. Making predictions on new observations requires averaging over this posterior probability distribution. Mean field techniques developed within the statistical physics of disordered systems are employed here to compute these averages. The resulting Bayesian algorithms are expressed as an extensive set of non-linear mean field equations which may be solved by iteration. Two different formalisms are used to derive the mean field equations: the cavity method and a saddle-point method. In the latter, the mean field equations are derived from the saddle point of a variational mean field free energy. Two different mean field free energies are studied: a naive and the so-called TAP mean field free energy. The cavity method and the TAP saddle-point method give equivalent mean field equations, although the derivations differ. The naive and TAP approaches give similar results in simulations, but the latter has the advantage that it may be analyzed theoretically, and one may derive estimators of the generalization error from the mean field theory.

The aims of this work are twofold. The first is to gain theoretical insight into how the Bayes algorithm infers a rule given by a neural network. Besides deriving algorithms, the mean field techniques may be used to derive the expected generalization error of the algorithms for learning scenarios in the thermodynamic limit. The results show fine agreement between the average case analysis and simulations for learning scenarios in the simple perceptron and in the committee machine. Bayesian online and query algorithms are also derived and studied theoretically.

The second aim is to derive mean field algorithms for use on real data. The mean field algorithm is derived for Gaussian processes. This choice is very flexible because, depending on the specification of the covariance function, different models may be tested. For example, one choice corresponds to the simple perceptron. The mean field algorithms are tested on three small benchmark data sets for various covariance functions, and the performances are found to be similar to the state of the art.

Preface

This thesis summarizes the results of my doctoral research. The work has been carried out at the Computational Neural Network Center, CONNECT, The Niels Bohr Institute, University of Copenhagen, and at Theoretical Physics, University of Lund. The thesis consists of two parts: an essay and a collection of reprinted papers. The first part, as well as serving as an introduction to the reprinted papers, contains new results building upon the results of the papers.

I wish to thank both my advisor, Benny Lautrup, and my supervisor in Lund, Carsten Peterson, for their guidance. I wish to thank everybody at CONNECT and in the group in Lund for providing me with good places to do research. It has been a pleasure to work with my collaborators Søren Halkjær, Benny Lautrup, Manfred Opper, Sara S. Solla and Jian-Bo Zhang. A special thanks to Manfred Opper for his guidance and generosity in sharing his great insight into the field. This work has been supported by the Danish Natural Science Council and the Danish Technical Research Council through CONNECT. Most importantly, I wish to thank Mette and Rebecca for their love and inspiration.

O.W.
Copenhagen, Denmark
May 1998


Similar resources

Construction cost estimation of spherical storage tanks: artificial neural networks and hybrid regression—GA algorithms

One of the most important processes in the early stages of construction projects is to estimate the cost involved. This process involves a wide range of uncertainties, which make it a challenging task. Because of unknown issues, using the experience of the experts or looking for similar cases are the conventional methods to deal with cost estimation. The current study presents data-driven metho...


DIFFERENT NEURAL NETWORKS AND MODAL TREE METHOD FOR PREDICTING ULTIMATE BEARING CAPACITY OF PILES

The prediction of the ultimate bearing capacity of the pile under axial load is one of the important issues for many researches in the field of geotechnical engineering. In recent years, the use of computational intelligence techniques such as different methods of artificial neural network has been developed in terms of physical and numerical modeling aspects. In this study, a database of 100 p...


Application of Artificial Neural Networks and Support Vector Machines for carbonate pores size estimation from 3D seismic data

This paper proposes a method for the prediction of pore size values in hydrocarbon reservoirs using 3D seismic data. To this end, an actual carbonate oil field in the south-western part of Iran was selected. Taking real geological conditions into account, different models of reservoir were constructed for a range of viable pore size values. Seismic surveying was performed next on these models. F...


Forecasting of heavy metals concentration in groundwater resources of Asadabad plain using artificial neural network approach

Nowadays, 90% of Iran's required water is supplied by groundwater resources, and forecasting the pollutant content of these resources is vital. Therefore, this research aimed to develop and employ a feedforward artificial neural network (ANN) to forecast the arsenic (As), lead (Pb), and zinc (Zn) concentrations in the groundwater resources of Asadabad plain. In this research, the ANN models we...


Novel Radial Basis Function Neural Networks based on Probabilistic Evolutionary and Gaussian Mixture Model for Satellites Optimum Selection

In this study, two novel learning algorithms have been applied on Radial Basis Function Neural Network (RBFNN) to approximate the functions with high non-linear order. The Probabilistic Evolutionary (PE) and Gaussian Mixture Model (GMM) techniques are proposed to significantly minimize the error functions. The main idea is concerning the various strategies to optimize the procedure of Gradient ...




Publication date: 1998